This repository includes:

- yt_1000_de_processed.tsv: the main data set
- top50_domains.tsv: the referral link coding for the top 50 domains as explained in the article
- yt_100_de_autosubs.tsv: all available auto-generated captions for videos from the top 100 channels
- replication_code.R: the R code for replication

Please note that the code issues a large number of HTTP requests via httr, so some URLs may no longer be retrievable at a later date.
For this reason, the variables related to affiliate link counts and the video captions are already included in this repository.
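
Because the derived variables ship with the repository, the data files can be read directly without re-running any requests. The following is a minimal sketch in base R, not the approach used in replication_code.R. The helper name `load_tsv` is hypothetical, and it assumes the files are plain tab-separated text with a header row.

```r
# Hypothetical helper (not part of replication_code.R): read one of the
# tab-separated files shipped with this repository into a data frame.
# Assumption: the file has a header row and uses tabs as separators.
load_tsv <- function(path) {
  if (!file.exists(path)) {
    stop("File not found: ", path)
  }
  read.delim(path, sep = "\t", stringsAsFactors = FALSE)
}

# Example calls, to be run from the repository root:
# main_data <- load_tsv("yt_1000_de_processed.tsv")
# domains   <- load_tsv("top50_domains.tsv")
# captions  <- load_tsv("yt_100_de_autosubs.tsv")
```

If a file contains free text with embedded quotation marks (the captions might), the quoting options of `read.delim` may need to be adjusted.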


The following R environment was used for all analyses:


# R version 3.4.4 (2018-03-15)
# Platform: x86_64-pc-linux-gnu (64-bit)
# Running under: Linux Mint 18.3
# 
# Matrix products: default
# BLAS: /usr/lib/libblas/libblas.so.3.6.0
# LAPACK: /usr/lib/lapack/liblapack.so.3.6.0
# 
# locale:
#   [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8     LC_MONETARY=en_US.UTF-8   
# [6] LC_MESSAGES=en_US.UTF-8    LC_PAPER=en_US.UTF-8       LC_NAME=C                  LC_ADDRESS=C               LC_TELEPHONE=C            
# [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
# 
# attached base packages:
#   [1] stats     graphics  grDevices utils     datasets  methods   base     
# 
# other attached packages:
#   [1] stargazer_5.2.1    stm_1.3.3          cowplot_0.9.2.9900 scales_0.5.0.9000  lubridate_1.7.2    quanteda_1.0.3     urltools_1.7.0    
# [8] httr_1.3.1         forcats_0.2.0      stringr_1.2.0      dplyr_0.7.4        purrr_0.2.4        readr_1.1.1        tidyr_0.8.0       
# [15] tibble_1.4.2       ggplot2_2.2.1.9000 tidyverse_1.2.1   
# 
# loaded via a namespace (and not attached):
#   [1] network_1.13.0      reshape2_1.4.3      haven_1.1.1         lattice_0.20-35     colorspace_1.3-2    yaml_2.1.16         rlang_0.2.0.9000   
# [8] pillar_1.1.0        foreign_0.8-69      glue_1.2.0          withr_2.1.2         modelr_0.1.1        readxl_1.0.0        bindrcpp_0.2       
# [15] bindr_0.1           plyr_1.8.4          munsell_0.4.3       gtable_0.2.0        cellranger_1.1.0    rvest_0.3.2         psych_1.7.8        
# [22] parallel_3.4.4      triebeard_0.3.0     broom_0.4.3         Rcpp_0.12.15        spacyr_0.9.6        RcppParallel_4.3.20 jsonlite_1.5       
# [29] fastmatch_1.1-0     stopwords_0.9.0     mnormt_1.5-5        hms_0.4.1           stringi_1.1.7       ggrepel_0.7.0       grid_3.4.4         
# [36] cli_1.0.0           tools_3.4.4         magrittr_1.5        lazyeval_0.2.1      crayon_1.3.4        pkgconfig_2.0.1     Matrix_1.2-12      
# [43] data.table_1.10.4-3 xml2_1.2.0          assertthat_0.2.0    rstudioapi_0.7      R6_2.2.2            nlme_3.1-131        compiler_3.4.4 
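
To see how a local installation differs from the environment listed above, the installed package versions can be compared with the recorded ones. This is only a sketch: the helper name `check_versions` is hypothetical, and the example call checks just three of the attached packages, with version numbers copied from the listing above.

```r
# Hypothetical helper: compare installed package versions with expected ones.
# `expected` is a named character vector, e.g. c(httr = "1.3.1").
check_versions <- function(expected) {
  pkgs <- names(expected)
  installed <- rep(NA_character_, length(pkgs))
  matches <- rep(FALSE, length(pkgs))
  for (i in seq_along(pkgs)) {
    # Packages that are not installed keep NA / FALSE.
    if (requireNamespace(pkgs[i], quietly = TRUE)) {
      v <- utils::packageVersion(pkgs[i])
      installed[i] <- as.character(v)
      matches[i] <- v == package_version(expected[[i]])
    }
  }
  data.frame(
    package = pkgs,
    expected = unname(expected),
    installed = installed,
    matches = matches,
    stringsAsFactors = FALSE
  )
}

# Example call with versions taken from the session info above:
# check_versions(c(httr = "1.3.1", quanteda = "1.0.3", stm = "1.3.3"))
```

A mismatch does not necessarily mean the code will fail, only that results may differ from those obtained in the original environment.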
